Sequence Modeling via Segmentations

Authors

  • Chong Wang
  • Yining Wang
  • Po-Sen Huang
  • Abdel-rahman Mohamed
  • Dengyong Zhou
  • Li Deng

Abstract

Segmental structure is a common pattern in many types of sequences such as phrases in human languages. In this paper, we present a probabilistic model for sequences via their segmentations. The probability of a segmented sequence is calculated as the product of the probabilities of all its segments, where each segment is modeled using existing tools such as recurrent neural networks. Since the segmentation of a sequence is usually unknown in advance, we sum over all valid segmentations to obtain the final probability for the sequence. An efficient dynamic programming algorithm is developed for forward and backward computations without resorting to any approximation. We demonstrate our approach on text segmentation and speech recognition tasks. In addition to quantitative results, we also show that our approach can discover meaningful segments in their respective application contexts.
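The summation over segmentations described in the abstract can be illustrated with a minimal sketch. It assumes a toy segment scorer, `toy_segment_log_prob`, standing in for the recurrent network; that function, its lexicon values, and the other names here are hypothetical and not taken from the paper. The forward recursion computes, for each prefix, the log of the summed probability over all ways to segment it, and a brute-force enumeration is included only to check the recursion on a short input.

```python
import math
from itertools import combinations


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def toy_segment_log_prob(segment):
    # Hypothetical stand-in for a learned segment model (e.g. an RNN).
    # The scores are illustrative and are not normalized probabilities.
    lexicon = {"ab": -0.7, "c": -1.2}
    return lexicon.get(segment, -2.0 * len(segment))


def sequence_log_prob(seq, segment_log_prob):
    """Forward dynamic program over all segmentations of seq.

    alpha[j] is the log of the total score of all segmentations of seq[:j].
    Each step extends every shorter prefix seq[:i] by one segment seq[i:j].
    """
    T = len(seq)
    alpha = [-math.inf] * (T + 1)
    alpha[0] = 0.0  # the empty prefix has exactly one (empty) segmentation
    for j in range(1, T + 1):
        terms = [alpha[i] + segment_log_prob(seq[i:j]) for i in range(j)]
        alpha[j] = logsumexp(terms)
    return alpha[T]


def brute_force_log_prob(seq, segment_log_prob):
    """Enumerate all 2**(T-1) segmentations of a non-empty seq (check only)."""
    T = len(seq)
    totals = []
    for k in range(T):
        for cuts in combinations(range(1, T), k):
            bounds = (0, *cuts, T)
            totals.append(
                sum(segment_log_prob(seq[a:b]) for a, b in zip(bounds, bounds[1:]))
            )
    return logsumexp(totals)


if __name__ == "__main__":
    s = "abcab"
    print(sequence_log_prob(s, toy_segment_log_prob))
    print(brute_force_log_prob(s, toy_segment_log_prob))
```

The recursion calls the segment scorer O(T^2) times, whereas direct enumeration visits a number of segmentations that grows exponentially in T. In a differentiable framework the same recursion could also supply gradients, but that part is not shown here.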

Similar resources

Functional Assessment of CODM Gene in Different Cultivar of Papaveraceous Species Via In Silico Analysis

Medicinal use of the opium poppy (Papaver somniferum L.) has an ancient history, but the isolation of morphine was not described until the early nineteenth century. Morphine has been the most important alkaloid of the opium poppy over the last 50 years. In the pathway reported to generate morphine in this species, CODM has a crucial role as the gene coding the enzyme respons...

Vers une modélisation statistique multi-niveau du langage, application aux langues peu dotées. (Toward a multi-level statistical language modeling for under-resourced language)

This PhD thesis focuses on the problems encountered when developing automatic speech recognition for under-resourced languages with a writing system without explicit separation between words. The specificity of the languages covered in our work requires automatic segmentation of text corpus into words in order to make the n-gram language modeling applicable. While the lack of text data has an i...

Problems and Algorithms for Sequence Segmentations

The analysis of sequential data is required in many diverse areas such as telecommunications, stock market analysis, and bioinformatics. A basic problem related to the analysis of sequential data is the sequence segmentation problem. A sequence segmentation is a partition of the sequence into a number of non-overlapping segments that cover all data points, such that each segment is as homogeneo...

Refinement of Object-Based Segmentation

Joshua Howard Levy: Refinement of Object-Based Segmentation (Under the direction of Stephen M. Pizer) Automated object-based segmentation methods calculate the shape and pose of anatomical structures of interest. These methods require modeling both the geometry and object-relative image intensity patterns of target structures. Many object-based segmentation methods minimize a non-convex functio...

Publication date: 2017